Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features
Authors
Abstract
i-vector modeling techniques have recently been used successfully for the speaker clustering task. In this work, we propose extracting i-vectors from short- and long-term speech features and fusing their PLDA scores within the framework of speaker diarization. Two sets of i-vectors are first extracted, one from short-term spectral features and one from long-term voice-quality, prosodic and glottal-to-noise excitation ratio (GNE) features. The PLDA scores of these two i-vectors are then fused for the speaker clustering task. Experiments have been carried out on single- and multiple-site scenario test sets of the Augmented Multi-party Interaction (AMI) corpus. Experimental results show that the i-vector based PLDA speaker clustering technique provides a significant diarization error rate (DER) improvement over the GMM based BIC clustering technique.
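The score-fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that pairwise PLDA scores between speech segments are already available as two symmetric matrices, one per feature set. The linear fusion weight, the average-linkage merging rule, the stopping threshold, and the function names `fuse_scores` and `agglomerative_cluster` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def fuse_scores(short_scores, long_scores, weight=0.8):
    """Linearly fuse two pairwise score matrices.

    `weight` is a hypothetical tuning parameter, not a value from the paper.
    """
    return weight * short_scores + (1.0 - weight) * long_scores

def agglomerative_cluster(scores, threshold):
    """Greedy average-linkage clustering on a symmetric similarity matrix.

    Repeatedly merges the most similar pair of clusters and stops when the
    best available score falls below `threshold` (an assumed stopping rule).
    """
    n = scores.shape[0]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best, best_pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average score between all cross-cluster segment pairs
                s = scores[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, best_pair = s, (a, b)
        if best < threshold:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(n, dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Toy scores for four segments (made-up numbers, not experimental data):
# segments 0 and 1 score high together, as do segments 2 and 3.
short_term = np.array([[0, 5, -4, -3],
                       [5, 0, -2, -5],
                       [-4, -2, 0, 6],
                       [-3, -5, 6, 0]], dtype=float)
long_term = np.array([[0, 2, -1, -2],
                      [2, 0, -3, -1],
                      [-1, -3, 0, 3],
                      [-2, -1, 3, 0]], dtype=float)

fused = fuse_scores(short_term, long_term, weight=0.8)
labels = agglomerative_cluster(fused, threshold=0.0)
print(labels)
```

In this toy case the two high-scoring pairs are merged first and the remaining cross-cluster score is negative, so clustering stops with two speakers. In practice the fusion weight and threshold would be tuned on development data.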
Similar resources
STC Speaker Recognition System for the NIST i-Vector Challenge
This paper presents a Speech Technology Center (STC) system submitted to the NIST i-vector Challenge. The system includes different subsystems based on PLDA, LDA-SVM, RBM-PLDA and DBN-PLDA. We propose an original iterative scheme for clustering the NIST i-vector Challenge devset. We also introduce the RBM-PLDA subsystem in the NIST i-vector Challenge. Experiments performed on the progress datas...
Hierarchical speaker clustering methods for the NIST i-vector Challenge
The process of manually labeling data is very expensive and sometimes infeasible due to privacy and security issues. This paper investigates the use of two algorithms for clustering unlabeled training i-vectors. This aims at improving speaker recognition performance by using state-of-the-art supervised techniques in the context of the NIST i-vector Machine Learning Challenge 2014. The first alg...
Improving robustness to compressed speech in speaker recognition
The goal of this paper is to analyze the impact of codec-degraded speech on a state-of-the-art speaker recognition system and propose mitigation techniques. Several acoustic features are analyzed, including the standard Mel filterbank cepstral coefficients (MFCC), as well as the noise-robust medium duration modulation cepstrum (MDMC) and power normalized cepstral coefficients (PNCC), to determin...
PLDA based speaker verification with weighted LDA techniques
This paper investigates the use of the dimensionality-reduction techniques weighted linear discriminant analysis (WLDA) and weighted median Fisher discriminant analysis (WMFD) before probabilistic linear discriminant analysis (PLDA) modeling for the purpose of improving speaker verification performance in the presence of high inter-session variability. Recently it was shown that WLDA techniqu...
Unsupervised Domain Adaptation for I-vector Speaker Recognition
In this paper, we present a framework for unsupervised domain adaptation of PLDA based i-vector speaker recognition systems. Given an existing out-of-domain PLDA system, we use it to cluster unlabeled in-domain data, and then use this data to adapt the parameters of the PLDA system. We explore two versions of agglomerative hierarchical clustering that use the PLDA system. We also study two auto...